Abstract

For point to point multiple input multiple output systems, Dayal-Brehler-Varanasi have proved
that training codes achieve the same diversity order as that of the underlying coherent space time
block code (STBC) if a simple minimum mean squared error estimate of the channel formed using
the training part is employed for coherent detection of the underlying STBC. In this letter, a similar
strategy involving a combination of training, channel estimation and detection in conjunction with
existing coherent distributed STBCs is proposed for noncoherent communication in amplify and
forward (AF) relay networks. Simulation results show that the proposed simple strategy outperforms
distributed differential space-time coding for AF relay networks. Finally, the proposed strategy is
extended to asynchronous relay networks using orthogonal frequency division multiplexing.

I. Introduction

Recently the idea of space time coding has been applied in wireless relay networks under
the name of distributed space time coding to extract benefits similar to those of point to point
multiple input multiple output (MIMO) systems. Mainly, two types of distributed
space time coding techniques are discussed in the literature: (i) decode and forward (DF) based
distributed space time coding [1], wherein a subset (chosen based on some criteria) of the
relay nodes decode the symbols from the source and transmit a row/column (footnote 1) of a
distributed space time block code (STBC) and (ii) amplify and forward (AF) based distributed space
time coding [2], where all the relay nodes perform linear processing on the received symbols
according to a distributed space time block code (DSTBC) and transmit the resulting symbols
to the destination. AF based distributed space time coding is of special interest because
the operations at the relay nodes are greatly simplified and moreover there is no need for
every relay node to inform the destination once every quasi-static duration whether it will be
participating in the distributed space time coding process, as is the case in DF based distributed
space time coding [1]. However, in [2], the destination was assumed to have perfect knowledge
of all the channel fading gains from the source to the relays and those from the relays to
the destination. To overcome the need for channel knowledge, distributed differential space
time coding was studied in [3], [4], [5], [6], which is essentially an extension of differential
unitary space time coding for point to point MIMO systems to the relay network case. But
distributed differential space time block code (DDSTBC) design is difficult compared to
coherent DSTBC design because of the extra stringent conditions (we refer readers to [4],
[6] for exact conditions) that need to be met by the codes. Moreover, all the codes in [3],
[4], [5] for more than two relays have exponential encoding complexity. On the other hand,
coherent DSTBCs with reduced maximum likelihood (ML) decoding complexity are available
in [8], [15], [18].

Footnote 1: Whether the relay transmits a column of an STBC or a row of an STBC depends on the system model.

Interestingly in [9], it was proved that for point to point MIMO systems, training codes (footnote 2)
achieve the same diversity order as that of the underlying coherent STBC. This was shown to
be possible if a simple minimum mean squared error (MMSE) estimate of the channel formed
using the training part of the code is employed for coherent detection of the underlying STBC.
The contributions of this letter are summarized as follows.

• Motivated by the results of [9], a similar training and channel estimation scheme is
proposed to be used in conjunction with coherent distributed space time coding in AF
relay networks as described in [2]. An interesting feature of the proposed training scheme
is that the relay nodes do not perform any channel estimation using the training symbols
transmitted by the source but instead simply amplify and forward the received training
symbols. The proposed strategy is shown to outperform the best known DDSTBCs [3],
[4], [5], [6] using simulations. Also, it is shown that appropriate power allocation among
the training and data symbols can further improve the error performance marginally.
• Finally, this training based strategy is extended to asynchronous relay networks with
no knowledge of the timing errors using the recently proposed Orthogonal Frequency
Division Multiplexing (OFDM) based distributed space time coding [7].

Footnote 2: Each codeword of a training code consists of a part known to the receiver (pilot) and a part
that contains codeword(s) of an STBC designed for the coherent channel (in which the receiver has
perfect knowledge of the channel).

The rest of this letter is organized as follows. The proposed training scheme along with
channel estimation is described in Section II. Extension to the asynchronous relay network
case is addressed in Section III. Simulation results comprise Section IV and conclusions
are presented in Section V.

Notation: Vectors and matrices are represented by lowercase and uppercase boldface
characters respectively. An identity matrix of size $N \times N$ will be denoted by $\mathbf{I}_N$. A complex
Gaussian vector with zero mean and covariance matrix $\boldsymbol{\Omega}$ will be denoted by
$\mathcal{CN}(\mathbf{0}, \boldsymbol{\Omega})$.

II. Proposed Training Based Strategy 

In this section, we briefly review the distributed space time coding protocol for AF relay 
networks in [2], make some crucial observations and then proceed to describe the proposed 
training based strategy. 

Consider a wireless relay network consisting of a source node, a destination node and
$R$ relay nodes $U_1, U_2, \dots, U_R$ which aid the source in communicating information to the
destination. All the nodes are assumed to be equipped with a half duplex constrained, single
antenna transceiver. The wireless channels between the terminals are assumed to be quasi-static
and flat fading. The channel fading gains from the source to the $i$-th relay, $f_i$, and those
from the $i$-th relay to the destination, $g_i$, are all assumed to be independent and identically
distributed (i.i.d.) complex Gaussian random variables with zero mean and unit variance.
Symbol synchronization and carrier frequency synchronization are assumed among all the
nodes.

A. Observations from Coherent Distributed Space Time Coding 

In order to explain coherent distributed space time coding, we shall assume in this subsection
alone that the destination has perfect knowledge of all the channel fading gains
$f_i, g_i$, $i = 1, \dots, R$. Every transmission cycle from the source to the destination is comprised
of two phases. In the first phase, the source transmits a vector
$\mathbf{z} = [z_1\ z_2\ \dots\ z_{T_1}]^T$ composed of $T_1$ complex symbols $z_i$, $i = 1, \dots, T_1$,
to all the $R$ relays using a fraction $\pi_1$ of the total power $P_d$ for data transmission. The vector
$\mathbf{z}$ satisfies $E[\mathbf{z}^H\mathbf{z}] = T_1$ and $P_d$ denotes the total average power spent by
the source and the relays for communicating data to the destination. The received vector at the
$i$-th relay is then given by $\mathbf{r}_i = \sqrt{\pi_1 P_d}\, f_i \mathbf{z} + \mathbf{v}_i$ where
$\mathbf{v}_i \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{T_1})$ represents the additive noise at the $i$-th relay.

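In the second phase, the $i$-th relay transmits
$\mathbf{t}_i = \sqrt{\frac{\pi_2 P_d}{\pi_1 P_d + 1}}\,\mathbf{B}_i\mathbf{r}_i$ or
$\mathbf{t}_i = \sqrt{\frac{\pi_2 P_d}{\pi_1 P_d + 1}}\,\mathbf{B}_i\mathbf{r}_i^*$ to
the destination, where $\mathbf{B}_i \in \mathbb{C}^{T_2 \times T_1}$ is called the 'relay matrix'.
Without loss of generality we may assume that the first $M$ relays linearly process $\mathbf{r}_i$ and
the remaining $R - M$ relays linearly process $\mathbf{r}_i^*$. Under the assumption that the
quasi-static duration of the channel is much greater than $2R$ channel uses, the received vector at
the destination can be expressed as

$$\mathbf{y} = \sum_{i=1}^{R} g_i\mathbf{t}_i + \mathbf{w} = \sqrt{\frac{\pi_1\pi_2 P_d^2}{\pi_1 P_d + 1}}\,\mathbf{X}\mathbf{h} + \mathbf{n}$$

where

$$\mathbf{X} = [\mathbf{B}_1\mathbf{z}\ \dots\ \mathbf{B}_M\mathbf{z}\ \ \mathbf{B}_{M+1}\mathbf{z}^*\ \dots\ \mathbf{B}_R\mathbf{z}^*], \qquad \mathbf{h} = [f_1g_1\ \ f_2g_2\ \dots\ f_Mg_M\ \ f_{M+1}^*g_{M+1}\ \dots\ f_R^*g_R]^T, \tag{1}$$

$$\mathbf{n} = \sqrt{\frac{\pi_2 P_d}{\pi_1 P_d + 1}}\left(\sum_{i=1}^{M} g_i\mathbf{B}_i\mathbf{v}_i + \sum_{i=M+1}^{R} g_i\mathbf{B}_i\mathbf{v}_i^*\right) + \mathbf{w}$$

and $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{T_2})$ represents the additive noise at the destination.
The power allocation factors $\pi_1$ and $\pi_2$ are chosen to satisfy
$\pi_1 P_d + \pi_2 P_d R = 2P_d$. The covariance matrix of $\mathbf{n}$ is given by
$\boldsymbol{\Gamma} = E[\mathbf{n}\mathbf{n}^H] = \mathbf{I}_{T_2} + \frac{\pi_2 P_d}{\pi_1 P_d + 1}\left(\sum_{i=1}^{R} |g_i|^2\,\mathbf{B}_i\mathbf{B}_i^H\right)$.
Let the DSTBC $\mathcal{C}$ denote the set of all possible codeword matrices $\mathbf{X}$. Then the ML
decoder is given by

$$\hat{\mathbf{X}} = \arg\min_{\mathbf{X} \in \mathcal{C}} \left\| \boldsymbol{\Gamma}^{-\frac{1}{2}}\left(\mathbf{y} - \sqrt{\frac{\pi_1\pi_2 P_d^2}{\pi_1 P_d + 1}}\,\mathbf{X}\mathbf{h}\right)\right\|_F^2. \tag{2}$$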

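Note from (2) that the ML decoder in general requires the knowledge (footnote 3) of all the channel
fading gains $f_i, g_i$, $i = 1, \dots, R$. Consider the following decoder:

$$\hat{\mathbf{X}} = \arg\min_{\mathbf{X} \in \mathcal{C}} \left\| \mathbf{y} - \sqrt{\frac{\pi_1\pi_2 P_d^2}{\pi_1 P_d + 1}}\,\mathbf{X}\mathbf{h}\right\|_F^2. \tag{3}$$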

Remark 1: The decoder in (3) is suboptimal in general and coincides with the ML decoder
for the case when $\boldsymbol{\Gamma}$ is a scaled identity matrix. The relay matrices for all the codes in [2],
[15], [17], [18] and some of the codes in [8] are unitary. For the case when $\mathbf{B}_i\mathbf{B}_i^H$ is a
diagonal matrix for all $i = 1, 2, \dots, R$ ($\boldsymbol{\Gamma}$ is a diagonal matrix for this case), the performance
of the suboptimal decoder in (3) differs from that of the ML decoder (2) only by coding
gain and the diversity gain is retained. This can be proved on similar lines as in the proof of
Theorem 7 in [8]. The class of DSTBCs from precoded co-ordinate interleaved orthogonal
designs in [8] is an example for the case of diagonal $\boldsymbol{\Gamma}$ matrix.
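
To make the first claim of Remark 1 concrete, the short derivation below is a sketch under the assumption that every relay matrix is square and unitary, so that $\mathbf{B}_i\mathbf{B}_i^H = \mathbf{I}_{T_2}$. The shorthands $c$ and $\rho$ are introduced here only for compactness. It shows why the noise covariance then collapses to a scaled identity and why the weighting by $\boldsymbol{\Gamma}^{-1/2}$ drops out of the ML metric.

```latex
\begin{align*}
\boldsymbol{\Gamma}
  &= \mathbf{I}_{T_2} + \frac{\pi_2 P_d}{\pi_1 P_d + 1}\sum_{i=1}^{R} |g_i|^2\,\mathbf{B}_i\mathbf{B}_i^H
   = \underbrace{\left(1 + \frac{\pi_2 P_d}{\pi_1 P_d + 1}\sum_{i=1}^{R}|g_i|^2\right)}_{c\,>\,0}\,\mathbf{I}_{T_2},\\[4pt]
\left\|\boldsymbol{\Gamma}^{-\frac{1}{2}}\left(\mathbf{y} - \rho\,\mathbf{X}\mathbf{h}\right)\right\|_F^2
  &= \frac{1}{c}\,\left\|\mathbf{y} - \rho\,\mathbf{X}\mathbf{h}\right\|_F^2,
  \qquad \rho = \sqrt{\frac{\pi_1\pi_2 P_d^2}{\pi_1 P_d + 1}}.
\end{align*}
```

Since $c$ does not depend on the candidate codeword $\mathbf{X}$, minimizing either side over $\mathbf{X} \in \mathcal{C}$ selects the same codeword.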

Footnote 3: $\boldsymbol{\Gamma}$ requires knowledge of the $g_i$'s and $\mathbf{h}$ requires knowledge of $f_ig_i$, $i = 1, \dots, M$ and
$f_i^*g_i$, $i = M + 1, \dots, R$, which together imply knowledge of $f_i, g_i$, $i = 1, \dots, R$.

The decoder in (3) requires only the knowledge of $\mathbf{h}$ and not necessarily the knowledge
of all the individual channel fading gains $f_i, g_i$, $i = 1, 2, \dots, R$. The training strategy to be
described in the sequel essentially exploits this crucial observation.
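
The observation above can be checked numerically. The following is a minimal sketch, not the authors' simulation code: it assumes $R = 2$ relays, $T_1 = T_2 = 2$, $M = 1$, unit-energy 4-QAM symbols, and the illustrative relay matrices $\mathbf{B}_1 = \mathbf{I}$ and $\mathbf{B}_2 = [[0, -1], [1, 0]]$, which give an Alamouti-form codeword. All function names are hypothetical. The signal is passed through both phases of the AF protocol and then decoded using only the equivalent channel $\mathbf{h}$.

```python
import itertools
import math
import random

# Unit-energy 4-QAM alphabet, so that E[z^H z] = T1 = 2.
QAM4 = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]


def cn(rng, std):
    """One circularly symmetric complex Gaussian sample with standard deviation std."""
    s = std / math.sqrt(2)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def codeword(z1, z2):
    """X = [B1 z, B2 z*] for B1 = I, B2 = [[0, -1], [1, 0]] (illustrative choice, M = 1)."""
    return [[z1, -z2.conjugate()],
            [z2, z1.conjugate()]]


def af_network(z, f, g, P_d, pi1, pi2, noise_std, rng):
    """Run both phases of the AF protocol and return the destination vector y."""
    a = math.sqrt(pi1 * P_d)                      # source scaling
    c = math.sqrt(pi2 * P_d / (pi1 * P_d + 1))    # relay scaling
    r1 = [a * f[0] * zk + cn(rng, noise_std) for zk in z]
    r2 = [a * f[1] * zk + cn(rng, noise_std) for zk in z]
    t1 = [c * x for x in r1]                                  # B1 r1
    t2 = [-c * r2[1].conjugate(), c * r2[0].conjugate()]      # B2 r2*
    return [g[0] * t1[k] + g[1] * t2[k] + cn(rng, noise_std) for k in range(2)]


def decode(y, h, rho):
    """Decoder (3): exhaustive search for the X minimising ||y - rho X h||^2."""
    def metric(pair):
        X = codeword(*pair)
        return sum(abs(y[k] - rho * (X[k][0] * h[0] + X[k][1] * h[1])) ** 2
                   for k in range(2))
    return min(itertools.product(QAM4, repeat=2), key=metric)


def noiseless_roundtrip(seed=1, P_d=100.0):
    """Check that, without noise, decoding with h alone returns the sent symbols."""
    rng = random.Random(seed)
    f = [cn(rng, 1.0), cn(rng, 1.0)]
    g = [cn(rng, 1.0), cn(rng, 1.0)]
    pi1, pi2 = 1.0, 0.5                            # pi1 + pi2 * R = 2 with R = 2
    h = [f[0] * g[0], f[1].conjugate() * g[1]]     # equivalent channel as in (1)
    rho = math.sqrt(pi1 * pi2 * P_d ** 2 / (pi1 * P_d + 1))
    z = (QAM4[0], QAM4[3])
    y = af_network(z, f, g, P_d, pi1, pi2, 0.0, rng)
    return decode(y, h, rho) == z
```

The decoder never sees $f_i$ or $g_i$ individually, only the two products collected in `h`. Passing a nonzero `noise_std` to `af_network` turns the same functions into a small error-rate experiment.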

B. Training cycle 

Note from the previous subsection that one data transmission cycle comprises $T_1 + T_2$
channel uses. In the proposed training strategy, we introduce a training cycle comprising
$R + 1$ channel uses for channel estimation before the start of the data transmission cycle. We
assume that the quasi-static duration of the channel is greater than $(R + 1) + F(T_1 + T_2)$
channel uses, where $F$ denotes the total number of data transmission cycles that can be
accommodated within the channel quasi-static duration. Thus, for $F = 1$, $T_1 = T_2 = R$,
the minimum channel quasi-static duration required for the proposed strategy is $3R + 1$
channel uses. Let $P_t$ be the total average power spent by the source and the relays during
the training cycle. Thus, the total average power $P$ used by the source and the relays is

$$P = \frac{P_t(R + 1) + P_d F(T_1 + T_2)}{R + 1 + F(T_1 + T_2)}.$$
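
The bookkeeping in this paragraph is easy to get wrong by one channel use, so a small sketch may help. The function names are hypothetical; the formulas are the coherence requirement $(R + 1) + F(T_1 + T_2)$ and the average power expression stated above.

```python
def coherence_needed(R, T1, T2, F):
    """Channel uses spanned by one training cycle plus F data cycles."""
    return (R + 1) + F * (T1 + T2)


def total_average_power(P_t, P_d, R, T1, T2, F):
    """Average power P per channel use over the training and data cycles."""
    return (P_t * (R + 1) + P_d * F * (T1 + T2)) / coherence_needed(R, T1, T2, F)


def training_fraction(R, T1, T2, F):
    """Fraction of the channel uses that is spent on training."""
    return (R + 1) / coherence_needed(R, T1, T2, F)
```

For example, with $R = 4$, $T_1 = T_2 = 4$ and $F = 50$, `coherence_needed` gives 405 channel uses, of which 5 are training.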

In the first phase of the training cycle, the source transmits the complex number 1 to all the
relays using a fraction $\pi_1$ of the total power $P_t$ dedicated for training. The received symbol
at the $i$-th relay, denoted by $\hat{r}_i$, is given by $\hat{r}_i = \sqrt{\pi_1 P_t}\, f_i + \hat{v}_i$ where
$\hat{v}_i \sim \mathcal{CN}(0, 1)$ is the additive noise at the $i$-th relay.

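The second phase of the training cycle comprises $R$ channel uses, out of which one
channel use is assigned to every relay node. Without loss of generality, we may assume that
the $i$-th time slot is assigned to the $i$-th relay. Furthermore, we assume that the value of $M$
to be used during the data transmission cycle is already decided. During its assigned time
slot, the $i$-th relay transmits

$$\hat{t}_i = \begin{cases} \sqrt{\dfrac{\pi_2 R P_t}{\pi_1 P_t + 1}}\;\hat{r}_i & \text{if } i \leq M \\[8pt] \sqrt{\dfrac{\pi_2 R P_t}{\pi_1 P_t + 1}}\;\hat{r}_i^* & \text{if } i > M. \end{cases}$$

At the end of the training cycle, the received vector $\hat{\mathbf{y}}$ at the destination is given as follows:

$$\hat{\mathbf{y}} = \sqrt{\frac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1}}\;\mathbf{h} + \hat{\mathbf{n}} \tag{4}$$

where $\hat{\mathbf{n}} = \sqrt{\frac{\pi_2 P_t R}{\pi_1 P_t + 1}}\,[g_1\hat{v}_1\ \dots\ g_M\hat{v}_M\ \ g_{M+1}\hat{v}_{M+1}^*\ \dots\ g_R\hat{v}_R^*]^T + \hat{\mathbf{w}}$,
$\mathbf{h}$ is the same as that given in (1) and $\hat{\mathbf{w}} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_R)$ is the additive
noise at the destination. The entire transmission from source to destination is illustrated
pictorially in Fig. 1 and Fig. 2.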

Note that the entries of $\mathbf{h}$ as well as $\hat{\mathbf{n}}$ are not complex Gaussian distributed since they
involve terms that are products of complex Gaussian random variables. To be precise, the
entries of $\mathbf{h}$ are i.i.d. random variables with mean 0 and variance 1. Similarly, the entries
of $\hat{\mathbf{n}}$ are i.i.d. random variables with mean 0 and variance $\left(\frac{\pi_2 P_t R}{\pi_1 P_t + 1} + 1\right)$. For the point to
point MIMO case, where the channel and additive receiver noise are modeled as complex
Gaussian, Dayal-Brehler-Varanasi in [9] have proposed a simple linear channel estimator. In
this letter, we propose to employ a similar estimator for the equivalent channel $\mathbf{h}$ as follows:

$$\hat{\mathbf{h}} = \sqrt{\frac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1}}\left(\frac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1} + 1\right)^{-1}\hat{\mathbf{y}}. \tag{5}$$

Now using the estimate $\hat{\mathbf{h}}$, coherent DSTBC decoding can be done in every data transmission
cycle as $\hat{\mathbf{X}} = \arg\min_{\mathbf{X} \in \mathcal{C}} \left\| \mathbf{y} - \sqrt{\frac{\pi_1\pi_2 P_d^2}{\pi_1 P_d + 1}}\,\mathbf{X}\hat{\mathbf{h}} \right\|_F^2$.
Thus, coherent DSTBCs [8], [15], [16], [17], [18] can be employed in noncoherent relay networks
via the proposed training scheme. We would like to mention that there may be better channel
estimation techniques than the one described by (5), but this is beyond the scope of this letter.
However, the simulation results in Section IV show that a simple channel estimator as in (5) is
good enough to outperform the best known DDSTBCs.
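
The training cycle of this subsection can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: all function names are hypothetical, $\pi_1 = 1$ and $\pi_2 = 1/R$ are chosen for the demo, and the linear estimator is written as a gain $a/(a^2 + \sigma^2)$ applied to each received training symbol, with $a^2 = \frac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1}$ and the `noise_var` default $\sigma^2 = 1$ taken as an assumption. Passing the noise variance $\frac{\pi_2 P_t R}{\pi_1 P_t + 1} + 1$ instead gives the textbook linear MMSE gain for uncorrelated, zero mean, unit variance channel entries.

```python
import math
import random


def cn(rng, var=1.0):
    """One circularly symmetric complex Gaussian sample with the given variance."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def equivalent_channel(f, g, M):
    """h: f_i g_i for the first M relays, conj(f_i) g_i for the rest."""
    return [(f[i] if i < M else f[i].conjugate()) * g[i] for i in range(len(f))]


def training_cycle(f, g, M, P_t, pi1, pi2, rng):
    """Return the R training symbols received at the destination."""
    R = len(f)
    a = math.sqrt(pi1 * P_t)                          # source scaling
    c = math.sqrt(pi2 * P_t * R / (pi1 * P_t + 1))    # relay scaling
    y = []
    for i in range(R):
        r = a * f[i] + cn(rng)                    # pilot symbol 1 received at relay i
        t = c * (r if i < M else r.conjugate())   # amplify and forward, no estimation
        y.append(g[i] * t + cn(rng))              # relay i transmits alone in its slot
    return y


def estimate_channel(y, P_t, pi1, pi2, noise_var=1.0):
    """Scale every received training symbol by the same linear gain."""
    R = len(y)
    a2 = pi1 * pi2 * R * P_t ** 2 / (pi1 * P_t + 1)
    gain = math.sqrt(a2) / (a2 + noise_var)
    return [gain * yi for yi in y]


def max_estimation_error(P_t, R=4, M=2, seed=7):
    """Largest entrywise error between h and its estimate for one channel draw."""
    rng = random.Random(seed)
    f = [cn(rng) for _ in range(R)]
    g = [cn(rng) for _ in range(R)]
    pi1, pi2 = 1.0, 1.0 / R
    y = training_cycle(f, g, M, P_t, pi1, pi2, rng)
    h = equivalent_channel(f, g, M)
    h_hat = estimate_channel(y, P_t, pi1, pi2)
    return max(abs(u - v) for u, v in zip(h, h_hat))
```

Note that the relays only scale (and possibly conjugate) what they receive, and the destination estimates the products $f_ig_i$ or $f_i^*g_i$ directly, never the individual gains.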

III. Training Strategy For Asynchronous Relay Networks 

The training strategy described in the previous section assumes that the transmissions
from all the relays are symbol synchronous with reference to the destination. In this section,
we relax this assumption and extend the proposed training strategy to asynchronous relay
networks with no knowledge of the timing errors of the relay transmissions. However, we
shall assume that the maximum of the relative timing errors from the source to the destination
is known. An asynchronous wireless relay network is depicted in Fig. 5. Let $\tau_i$ denote the
overall relative timing error of the signals arriving at the destination node from the $i$-th relay
node. Without loss of generality, we assume that $\tau_1 = 0$, $\tau_{i+1} \geq \tau_i$, $i = 1, \dots, R - 1$.

Recently there have been several works [7], [8], [10], [11], [12], [13], [14] on distributed
space time coding for asynchronous relay networks, some of which employ OFDM. The
proposed scheme relies on the OFDM based distributed space time coding in [7], [8], which
is essentially distributed space time coding over OFDM symbols, where the cyclic prefix (CP)
of OFDM is used to mitigate the effects of symbol asynchronism. The number of sub-carriers
$N$ and the length of the CP $l_{cp}$ are chosen such that $l_{cp} \geq \max\{\tau_1, \tau_2, \dots, \tau_R\}$.
The channel quasi-static duration assumed for this strategy is $((R + 1) + F(2R))(N + l_{cp})$
channel uses.

As for the synchronous case, there will be a training cycle before the start of data
transmission from the source. In the first phase of the training cycle, the source takes the $N$
point inverse discrete Fourier transform (IDFT) of the $N$ length vector $\mathbf{p} = [1\ 1\ \dots\ 1]^T$
and adds a CP of length $l_{cp}$ to form an OFDM symbol $\bar{\mathbf{p}}$. This OFDM symbol is transmitted to
the relays using a fraction $\pi_1$ of the total power $P_t$. The $i$-th relay receives
$\hat{\mathbf{r}}_i = \sqrt{\pi_1 P_t}\, f_i\bar{\mathbf{p}} + \hat{\mathbf{v}}_i$
where $\hat{\mathbf{v}}_i \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{N + l_{cp}})$ is the additive noise at the $i$-th relay. The second phase of
the training cycle comprises $R$ OFDM time slots and the $i$-th relay is allotted the $i$-th
OFDM time slot for transmission. During its scheduled time slot, the $i$-th relay transmits

$$\hat{\mathbf{t}}_i = \begin{cases} \sqrt{\dfrac{\pi_2 R P_t}{\pi_1 P_t + 1}}\;\hat{\mathbf{r}}_i & \text{if } i \leq M \\[8pt] \sqrt{\dfrac{\pi_2 R P_t}{\pi_1 P_t + 1}}\;\zeta(\hat{\mathbf{r}}_i^*) & \text{if } i > M \end{cases}$$

where $\zeta(\cdot)$ denotes the time reversal operation, i.e.,
$\zeta(\mathbf{r}(n)) = \mathbf{r}(N + l_{cp} - n)$. The destination receives $R$ OFDM symbols which are processed
as follows:
as follows: 

1) Remove the CP for the first $M$ OFDM symbols.

2) For the remaining OFDM symbols, remove the CP to get an $N$-length vector. Then shift
the last $l_{cp}$ samples of the $N$-length vector as the first $l_{cp}$ samples.

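Discrete Fourier transform (DFT) is then applied on the resulting $R$ vectors to obtain
$\mathbf{y}_j = [y_{0,j}\ y_{1,j}\ \dots\ y_{N-1,j}]^T$, $j = 1, 2, \dots, R$. Let
$\mathbf{w}_j = [w_{0,j}\ w_{1,j}\ \dots\ w_{N-1,j}]^T$
represent the additive noise at the destination node in the $j$-th OFDM time slot and let
$\mathbf{v}_j = [v_{0,j}\ v_{1,j}\ \dots\ v_{N-1,j}]^T$ denote the DFT of $\hat{\mathbf{v}}_j$ after CP removal. Note that a delay
$\tau$ in the time domain translates to a phase change of $e^{-\frac{i2\pi k\tau}{N}}$ in the $k$-th sub-carrier. Now using
the identities $(\mathrm{DFT}(\mathbf{x}))^* = \mathrm{IDFT}(\mathbf{x}^*)$, $(\mathrm{IDFT}(\mathbf{x}))^* = \mathrm{DFT}(\mathbf{x}^*)$,
$\mathrm{DFT}(\zeta(\mathrm{DFT}(\mathbf{x}))) = \mathbf{x}$,
$\mathbf{p}^* = \mathbf{p}$ we have in the $j$-th OFDM time slot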



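$$\mathbf{y}_j = \begin{cases} \sqrt{\dfrac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1}}\; f_jg_j\,\mathbf{p} \circ \mathbf{d}^{\tau_j} + \sqrt{\dfrac{\pi_2 R P_t}{\pi_1 P_t + 1}}\; g_j\,\mathbf{v}_j \circ \mathbf{d}^{\tau_j} + \mathbf{w}_j & \text{if } j \leq M \\[10pt] \sqrt{\dfrac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1}}\; f_j^*g_j\,\mathbf{p} \circ \mathbf{d}^{\tau_j} + \sqrt{\dfrac{\pi_2 R P_t}{\pi_1 P_t + 1}}\; g_j\,\mathbf{v}_j^* \circ \mathbf{d}^{\tau_j} + \mathbf{w}_j & \text{if } j > M \end{cases}$$

where $\mathbf{d}^{\tau_j} = \left[1\ \ e^{-\frac{i2\pi\tau_j}{N}}\ \dots\ e^{-\frac{i2\pi\tau_j(N-1)}{N}}\right]^T$ and $\circ$ denotes
Hadamard product. Thus, in each sub-carrier $k$, $0 \leq k \leq N - 1$, we get

$$\mathbf{y}_k = [y_{k,1}\ y_{k,2}\ \dots\ y_{k,R}]^T = \sqrt{\frac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1}}\;\mathbf{h}_k + \mathbf{n}_k \tag{6}$$

where

$$\mathbf{h}_k = \left[f_1g_1\ \ u_k^{\tau_2}f_2g_2\ \dots\ u_k^{\tau_M}f_Mg_M\ \ u_k^{\tau_{M+1}}f_{M+1}^*g_{M+1}\ \dots\ u_k^{\tau_R}f_R^*g_R\right]^T, \tag{7}$$

$u_k = e^{-\frac{i2\pi k}{N}}$ and

$$\mathbf{n}_k = \sqrt{\frac{\pi_2 P_t R}{\pi_1 P_t + 1}}\left[u_k^{\tau_1}g_1v_{k,1}\ \dots\ u_k^{\tau_M}g_Mv_{k,M}\ \ u_k^{\tau_{M+1}}g_{M+1}v_{k,M+1}^*\ \dots\ u_k^{\tau_R}g_Rv_{k,R}^*\right]^T + [w_{k,1}\ w_{k,2}\ \dots\ w_{k,R}]^T.$$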

Analogous to the synchronous case, we propose to estimate the equivalent channel vector
$\mathbf{h}_k$ from (6) as $\hat{\mathbf{h}}_k = \sqrt{\frac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1}}\left(\frac{\pi_1\pi_2 R P_t^2}{\pi_1 P_t + 1} + 1\right)^{-1}\mathbf{y}_k$.
After the training cycle, the data transmission cycle starts, for which we refer the readers to [7]
and Section IV of [8] for a detailed explanation. In essence, a DSTBC is seen by the destination in
every sub-carrier and the equivalent channel seen by the destination in the $k$-th sub-carrier is
precisely $\mathbf{h}_k$, whose estimated value is available at the end of the training cycle. As for the
synchronous case (see (3)), we propose to ignore the covariance matrix of the equivalent
noise while performing data detection.
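
The role of the CP in this section rests on one fact: a delay of $\tau \leq l_{cp}$ samples becomes a per sub-carrier phase rotation after CP removal and DFT. The sketch below checks this for the all-ones training vector $\mathbf{p}$, for which the received frequency-domain vector should equal $\mathbf{d}^{\tau}$. It is a noiseless, single-relay illustration with hypothetical function names; the samples preceding the late symbol are modelled as zeros, which is an assumption (they are discarded with the CP in any case).

```python
import cmath


def dft(x):
    """N-point DFT, written out directly for clarity."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(X):
    """N-point inverse DFT matching dft above."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


def add_cp(x, l_cp):
    """Prefix the last l_cp samples of x (cyclic prefix)."""
    return x[len(x) - l_cp:] + x


def receive_late_symbol(symbol, tau, l_cp):
    """Samples kept by the destination for an OFDM symbol arriving tau samples late.

    The receive window is aligned to the earliest relay (tau = 0) and has the
    length of one OFDM symbol; the CP is then removed.
    """
    if not 0 <= tau <= l_cp:
        raise ValueError("tau must lie within the cyclic prefix")
    window = ([0j] * tau + symbol)[:len(symbol)]
    return window[l_cp:]


def phase_vector(tau, N):
    """d^tau = [1, e^{-i 2 pi tau / N}, ..., e^{-i 2 pi tau (N - 1) / N}]."""
    return [cmath.exp(-2j * cmath.pi * k * tau / N) for k in range(N)]


def pilot_after_delay(N, l_cp, tau):
    """Frequency-domain view of the delayed all-ones training symbol."""
    p = [1 + 0j] * N                      # frequency-domain training vector
    symbol = add_cp(idft(p), l_cp)        # OFDM symbol with CP
    return dft(receive_late_symbol(symbol, tau, l_cp))
```

Because the delay never exceeds the CP length, the window after CP removal contains a cyclically shifted copy of the IDFT output, which is what turns the unknown timing error into the factor $u_k^{\tau_j}$ absorbed into $\mathbf{h}_k$.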

IV. Simulation Results 

In this section, simulations are used to compare the error performance of the proposed
strategy against the best known DDSTBC for 2 relays [4] and 4 relays [6]. Note that for 4
relays, the DDSTBCs in [6] were shown to outperform the codes reported in [3], [4], [5] in
both complexity as well as performance. For all the simulations, we set $\pi_1 = 1$, $\pi_2 = \frac{1}{R}$ (as
suggested in [2]), $T_1 = T_2 = 4$ and $F = 50$. The channel fading gains $f_i, g_i$, $i = 1, \dots, R$ are
each generated independently following a complex Gaussian distribution with mean 0 and
unit variance (footnote 4). The decoder used for the proposed scheme is the one described by (3) and
for the DDSTBC case, the decoder proposed in [6] has been used. We chose $P_t = (1 + \alpha)P_d$,
where $\alpha$ denotes the power boost factor to allow for power boosting to the pilot symbols.
In order to quantify the loss in error performance due to channel estimation errors in the
proposed strategy, the performance of the corresponding coherent DSTBC (assuming perfect
channel knowledge) is taken as the reference.
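
If the comparison is made at a fixed total average power $P$ (an assumption here, since the normalization is not spelled out above), the expression for $P$ from Section II-B together with $P_t = (1 + \alpha)P_d$ fixes the data power. A short worked version for $R = 4$, $T_1 = T_2 = 4$ and $F = 50$:

```latex
\begin{align*}
P &= \frac{(1+\alpha)P_d(R+1) + P_d F(T_1+T_2)}{R+1+F(T_1+T_2)}
\;\Longrightarrow\;
P_d = \frac{R+1+F(T_1+T_2)}{(1+\alpha)(R+1) + F(T_1+T_2)}\,P,\\[4pt]
\alpha = 0:&\quad P_d = P_t = P,\\
\alpha = 0.4:&\quad P_d = \frac{405}{407}\,P \approx 0.995\,P,
\qquad P_t = 1.4\,P_d \approx 1.393\,P.
\end{align*}
```

Under this normalization, a 40% pilot boost costs the data symbols only about half a percent of their power, because training occupies 5 of the 405 channel uses.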

For a 2 relay network, the Alamouti code is applied both as a DDSTBC [4] and as the
underlying coherent STBC in the proposed training scheme. The signal constellation is
chosen to be 4-QAM and 16-QAM for rates of 1 and 2 bits per channel use (bpcu) respectively.
Fig. 3 shows the error performance of the proposed strategy in comparison with the Alamouti
DDSTBC and the corresponding coherent DSTBC for $\alpha = 0$ and transmission rates (footnote 5)
of 1 bpcu and 2 bpcu. It can be observed that the proposed scheme has marginally
better performance compared to the DDSTBC strategy for transmission rates of 1 and 2 bpcu.
Note that the performance advantage of the proposed strategy over the DDSTBC strategy is
larger for the 2 bpcu case.

Footnote 4: This is a suitable assumption for the case when the relays are approximately equidistant
from both the source as well as the destination.

For a 4 relay network, the coherent DSTBC employed in the proposed strategy for simulations
is a $4 \times 4$ codeword matrix $\mathbf{X}$ in the complex symbols $z_1, z_2, z_3, z_4$ and their conjugates
(the entries of the matrix are not legible in this copy), where $\{\mathrm{Re}(z_1), \mathrm{Re}(z_2)\}$,
$\{\mathrm{Re}(z_3), \mathrm{Re}(z_4)\}$, $\{\mathrm{Im}(z_1), \mathrm{Im}(z_2)\}$
and $\{\mathrm{Im}(z_3), \mathrm{Im}(z_4)\}$ take values from quadrature amplitude modulation (QAM) rotated by
166.7078° (QAM constellation size chosen based on transmission rate). The relay matrices
corresponding to this coherent DSTBC are unitary and $M = 2$. The DDSTBC taken for
comparison is the one reported recently in [6]. It can be observed from Fig. 4 that for a
rate of 1 bpcu and codeword error rate (CER) of $10^{-5}$, the proposed strategy outperforms
the DDSTBC of [6] by approximately 2 dB for $\alpha = 0$. For a transmission rate of 2 bpcu,
the performance gap between the proposed strategy and the DDSTBC of [6] increases to 8
dB. Finally, observe that a 40% power boost to the pilot symbols gives marginally better
performance (gain of 0.7 dB).

From all the above simulations, we infer that the performance advantage of the proposed
strategy over DDSTBCs increases as the transmission rate increases. Also, note that the
proposed strategy is better than the DDSTBCs of [6], [4] at all signal to noise ratios (SNR).
In spite of the simple channel estimation method employed (Eq. (5)), note from Fig. 3 and
Fig. 4 that the performance loss due to channel estimation errors is only about 3 dB for
transmission rates of both 1 and 2 bpcu. We can attribute three reasons for the proposed
strategy to outperform DDSTBCs as follows: (1) lesser equivalent noise power seen by the
destination during the data transmission cycle as compared to distributed differential space time
coding [3], [4], [5], [6], (2) no restriction of coherent DSTBC codewords to unitary/scaled
unitary matrices as is the case with DDSTBCs [3], [4], [5], [6] and (3) the relay matrices
$\mathbf{B}_i$, $i = 1, 2, \dots, R$ need not satisfy certain algebraic relations involving the codewords (see
[4], [6] for exact relations), thus giving more room to optimize the minimum determinant of
difference matrices (coding gain).

Footnote 5: When calculating transmission rate, the rate loss due to the initial few channel uses for
training is ignored ($R + 1$ for the proposed strategy and $2R$ for DDSTBC [3], [4], [5], [6]).

Simulation results are not reported for the asynchronous case because the use of OFDM
essentially makes the signal model in every sub-carrier similar to the synchronous case. Except
for a rate loss due to CP, the performance will thus be the same as that for the synchronous case.
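
As a rough illustration of the CP rate loss mentioned above: each OFDM symbol occupies $N + l_{cp}$ channel uses but carries only $N$ sub-carrier symbols, so the rate of the synchronous scheme is scaled by the factor below. The values $N = 64$ and $l_{cp} = 16$ are hypothetical numbers chosen for the arithmetic, not parameters taken from the simulations.

```latex
\[
\eta_{\mathrm{CP}} = \frac{N}{N + l_{cp}}, \qquad
N = 64,\ l_{cp} = 16 \;\Rightarrow\; \eta_{\mathrm{CP}} = \frac{64}{80} = 0.8 .
\]
```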

V. Conclusion 

Similar to the results of [9] for point to point MIMO systems, a simple training and channel 
estimation scheme combined with the protocol in [2] was shown to outperform distributed 
differential space time coding at all SNR. The proposed strategy leverages existing coherent 
DSTBCs [8], [15], [16], [17], [18] for noncoherent communication in AF relay networks. 
Finally, the proposed strategy is extended for application in asynchronous relay networks with 
no knowledge of the timing errors using OFDM. Some of the interesting directions for further 
work are: (1) design of optimal training sequences, (2) better channel estimation techniques 
and (3) optimal power allocation between the training cycle and the data transmission cycle. 
Fig. 4. Error performance comparison for a 4 relay network

Fig. 5. Asynchronous wireless relay network